Import sl benchmark project #124

Open
joellabes wants to merge 7 commits into main from import-sl-benchmark-project

@joellabes
Collaborator

Bring the semantic layer benchmark project from https://roundup.getdbt.com/p/semantic-layer-as-the-data-interface into ADE-bench.

  • Fuzzy matching of result tables
  • Every task is split into its own directory, so there's a lot of boilerplate, but I decided that was worth it. The alternative was a single file containing all the tasks and gold queries, but that would remove flexibility if we want to modify the project during setup (e.g. to remove the existing semantic models and have the agent recreate them).
  • New source database (acme_insurance.duckdb.zip), so we'll have to add that to the shared Google Drive :/

@joellabes joellabes requested review from b-per and bstancil February 12, 2026 03:47
joellabes and others added 2 commits February 12, 2026 16:49
…permissions

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@bstancil
Collaborator

Two main things here:

  1. For the tasks, the prior convention was to use the same task prefix for all tasks that share a project. So, if the project is acme_insurance, the tasks would be acme_insurance_xxx. This PR breaks that convention, both by naming tasks things like free_text_hqls_001 and by using different suffixes (hqls, hqhs, etc.). If there's a good reason for that, great, but we should make that change intentionally.
  2. There is already some support for fuzzy matching in the harness. It should handle mismatched column ordering and row ordering, and there's support for approximate equality in numeric fields (https://github.com/dbt-labs/ade-bench/blob/main/docs/CONTRIBUTING.md#approximate-equality-tests). I'm not sure if this will cover what y'all need, but it might be good to unify these so that it's all handled through a single mechanism.
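
To make the second point concrete, here is a rough sketch of the kind of comparison being discussed: one that ignores column order and row order and treats numeric fields as equal within a tolerance. It is not the harness's actual implementation or this PR's matcher. The function names (`values_match`, `rows_match`, `tables_match`), the list-of-dicts table representation, and the default tolerance are all assumptions made for illustration.

```python
import math


def values_match(a, b, rel_tol=1e-6):
    """Numbers compare approximately; everything else compares exactly."""
    def is_number(v):
        # bool is a subclass of int, so exclude it explicitly
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if is_number(a) and is_number(b):
        return math.isclose(a, b, rel_tol=rel_tol)
    return a == b


def rows_match(actual_row, expected_row, rel_tol=1e-6):
    """Rows are dicts, so column order is irrelevant by construction."""
    if actual_row.keys() != expected_row.keys():
        return False
    return all(
        values_match(actual_row[col], expected_row[col], rel_tol)
        for col in expected_row
    )


def tables_match(actual, expected, rel_tol=1e-6):
    """Compare two tables (lists of dict rows) ignoring row order.

    Hypothetical helper: each expected row is paired greedily with the
    first unmatched actual row it agrees with. This is quadratic, which is
    fine for small result tables.
    """
    if len(actual) != len(expected):
        return False
    unmatched = list(actual)
    for expected_row in expected:
        for i, actual_row in enumerate(unmatched):
            if rows_match(actual_row, expected_row, rel_tol):
                del unmatched[i]
                break
        else:
            # no remaining actual row agrees with this expected row
            return False
    return True


# Made-up example data, not taken from the benchmark project
gold = [
    {"region": "west", "premium": 100.0},
    {"region": "east", "premium": 250.5},
]
agent = [
    {"premium": 250.5000001, "region": "east"},
    {"premium": 100.0, "region": "west"},
]
print(tables_match(agent, gold))
```

If both the existing approximate-equality tests and the new fuzzy matcher reduce to something of this shape, that is an argument for keeping a single code path and exposing the tolerance as a per-task setting.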
